Professor who developed one of the computer models for BCS speaks (December 11, 2003)
Posted 4:06 p.m., December 11, 2003 (#9) - Jesse Frey
tangotiger,
I'm not sure what specific situation you are thinking of, but I can confirm for you that the Colley method does have the unfortunate property that playing a weak opponent and winning could cost a #1 team its ranking. This can occur even if the game in question is the only new game played (i.e., the drop requires no reevaluation of past games). This makes it of some interest to USC fans that LSU's game against a D-IAA opponent didn't count in the Colley rankings.
The Colley method has this problem because it obtains its rankings just by comparing each team's winning percentage to the strength of its average opponent. The method doesn't correspond to any statistical model that would give you, for example, probabilities for future game results. A method based on a likelihood would almost certainly not have this type of problem.
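For anyone who wants to experiment with this, here is a minimal sketch of the Colley matrix computation (the code and the four-team schedule are mine, not Colley's). It makes plain that the ratings come from solving a linear system rather than from maximizing any likelihood:

    import numpy as np

    def colley_ratings(teams, games):
        # Build and solve the Colley system C r = b.
        # teams: list of team names; games: list of (winner, loser) pairs.
        idx = {t: i for i, t in enumerate(teams)}
        n = len(teams)
        C = 2.0 * np.eye(n)   # diagonal starts at 2 (the Laplace prior)
        b = np.ones(n)        # b_i = 1 + (wins_i - losses_i) / 2
        for winner, loser in games:
            w, l = idx[winner], idx[loser]
            C[w, w] += 1      # each game adds 1 to both teams' diagonal entries
            C[l, l] += 1
            C[w, l] -= 1      # and subtracts 1 from the head-to-head entries
            C[l, w] -= 1
            b[w] += 0.5
            b[l] -= 0.5
        return dict(zip(teams, np.linalg.solve(C, b)))

    # A hypothetical four-team schedule, just to exercise the code.
    print(colley_ratings(["A", "B", "C", "D"],
                         [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]))

Each rating depends only on wins, losses, and the ratings of the opponents faced, which is how a win over a sufficiently weak opponent can drag a top team down.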
Posted 4:59 p.m., December 11, 2003 (#11) - Jesse Frey
tangotiger,
I have a program that can implement the B-T model described in your link (#10). Leaving out D-IAA games just as Colley does, the top 5 teams would be LSU, Oklahoma, Miami(OH), Ohio State, and USC. If you do a bit more of what one might call 'regression to the mean' and assume that each team, rather than simply having a tie with a fictitious average team (as in the link), splits a pair of games with this fictitious team, then the top 5 teams would be Oklahoma, LSU, Miami(OH), USC, and Ohio State. Under either of these methods, winning an additional game could only improve your rating.
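For concreteness, here is one way such a fit might be coded (a sketch under my own assumptions: scipy's optimizer, a log-merit parameterization, and a made-up four-team schedule; this is not necessarily the program described above). Each fictitious split multiplies a team's likelihood by m/(m+1)^2, the probability of one win and one loss against a fixed merit-1 "average" team:

    import numpy as np
    from scipy.optimize import minimize

    def fit_bradley_terry(teams, games, splits=1):
        # Maximum-likelihood Bradley-Terry merits, with each team assumed
        # to split `splits` pairs of games with a fictitious merit-1 team.
        idx = {t: i for i, t in enumerate(teams)}

        def neg_log_lik(theta):   # theta = log-merits, so merits stay positive
            m = np.exp(theta)
            ll = sum(np.log(m[idx[w]] / (m[idx[w]] + m[idx[l]]))
                     for w, l in games)
            # One split = a win (prob m/(m+1)) and a loss (prob 1/(m+1)).
            ll += splits * np.sum(np.log(m) - 2 * np.log(m + 1))
            return -ll

        res = minimize(neg_log_lik, np.zeros(len(teams)))
        return dict(zip(teams, np.exp(res.x)))

    # Hypothetical schedule: A beats B and C, B beats C, C beats D.
    print(fit_bradley_terry(["A", "B", "C", "D"],
                            [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]))

Because the fictitious loss has probability 1/(m+1), which shrinks as m grows, the likelihood is bounded and every merit comes out finite; adding a real win can only raise a team's fitted merit.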
Posted 5:30 p.m., December 11, 2003 (#13) - Jesse Frey
tangotiger,
I don't think that adding a game in which a team plays itself to a tie makes any difference in these ratings. The game or games that are added in need to serve as a penalty to keep undefeated teams from having an infinitely high rating, and a team having a game against itself doesn't do that.
When you add in a win and a loss against a fictitious average team, that loss appears less and less probable the higher the team's rating, and there eventually comes a point, even for an undefeated team, when increasing the rating further would decrease rather than increase the likelihood of the entire set of real and fictitious results.
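A small calculation (my own, not from the thread) shows the penalty at work. If a team has k real wins over merit-1 opponents plus a fictitious win and loss against a merit-1 team, its likelihood is m^(k+1)/(m+1)^(k+2), which is maximized at the finite value m = k+1. A quick numerical check:

    import numpy as np

    k = 10                                   # real wins over merit-1 opponents
    m = np.linspace(0.01, 50, 200001)
    lik = m**(k + 1) / (m + 1)**(k + 2)      # includes the fictitious win/loss
    print(m[np.argmax(lik)])                 # about 11.0, i.e. k + 1

Without the fictitious loss, the likelihood would instead be m^(k+1)/(m+1)^(k+1), which increases without bound as m grows.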
Posted 1:02 p.m., December 12, 2003 (#27) - Jesse Frey
tangotiger,
Here is an example to explain the infinity issue. Suppose that there are only 2 teams, A and B. A single game is played, and A defeats B. In the Bradley-Terry model, we assume that there is a hidden merit parameter for each team, say m(A) for team A and m(B) for team B. The probability that team A defeats team B, given these merit ratings, is m(A)/(m(A)+m(B)). Since we observed only that team A defeated team B, we maximize the likelihood of the observed game results by choosing m(A) and m(B) in such a way that the probability m(A)/(m(A)+m(B)) that A defeats B is as large as possible. Since the merit parameters are determined only up to a constant, we may assume that m(B)=1. We then try to maximize m(A)/(m(A)+1). No finite maximizing value m(A) exists, since the bigger we make m(A), the higher the value for m(A)/(m(A)+1). This is the problem that adding in extra game results tries to solve.
Suppose now that we add in a pair of games in which team A defeats itself and then loses to itself. The probability of A defeating B is still m(A)/(m(A)+m(B)). The probability that team A defeats itself is m(A)/(m(A)+m(A)), or 1/2, and the probability that team A loses to itself is likewise 1/2. The likelihood we choose m(A) and m(B) to maximize is then (1/2)(1/2)(m(A)/(m(A)+m(B))). Since the two factors of 1/2 contribute only a constant factor of 1/4, we end up in the same situation as in the previous paragraph.
Suppose that, instead of adding in a game in which team A ties itself, we add in, for each of team A and team B, a win and a loss to a team with known rating 1. The probability that A defeats this team is m(A)/(m(A)+1), and the probability that A loses to this team is 1/(m(A)+1). Thus the probability, given m(A) and m(B), that all 5 (1 real, 4 fictitious) games have the results we observed is the product (m(A)/(m(A)+1))*(1/(m(A)+1))*(m(B)/(m(B)+1))*(1/(m(B)+1))*(m(A)/(m(A)+m(B))). What makes this different from adding in games where team A plays itself is that the additional factors involve the parameter m(A). Choosing m(A) and m(B) to maximize this product then gives the ratings m(A)=1.695 and m(B)=0.590 without any further normalizations.
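Those final numbers are easy to verify numerically; the following short script (scipy is my choice of tool here, not anything from the original post) maximizes the five-game likelihood above and reproduces m(A) = 1.695 and m(B) = 0.590:

    import numpy as np
    from scipy.optimize import minimize

    def neg_log_lik(theta):
        mA, mB = np.exp(theta)   # optimize log-merits so both stay positive
        return -(np.log(mA) - 2 * np.log(mA + 1)    # A's fictitious win and loss
                 + np.log(mB) - 2 * np.log(mB + 1)  # B's fictitious win and loss
                 + np.log(mA / (mA + mB)))          # the real game: A beat B

    print(np.exp(minimize(neg_log_lik, [0.0, 0.0]).x))  # ~ [1.695, 0.590]

As a sanity check, the two fitted merits multiply to 1, which the symmetry of the fictitious games requires.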
A method for determining the probability that a given team was the true best team in some particular year (January 6, 2004)
Posted 4:45 p.m., January 6, 2004 (#5) - Jesse Frey
AED,
I used the Bradley-Terry model because of its simplicity. It also, from everything I've seen, fits baseball game results quite well. I'm not quite sure what you mean when you say that 'random variations in team performance are Gaussian,' but I have no doubt that a model which used the Gaussian CDF as in your homepage link would give results similar to those I obtained. Is there a way, using your methods, to find analytically the probability that a given team is the best team?
MGL,
I'm not aware of an easier method to find the probability that a given team is the best team. Certainly the rankings of the teams could have been obtained with less effort. Is there a reference you could point me to?